ABSTRACT

We present a method for the detection of sleep stages using
the EEG (electroencephalogram). The method consists of
four steps: segmentation; parameter extraction; cluster
analysis; and classification. The parameters we compared were
the parameters of Hjorth, the harmonic parameters and the
relative band energy. For cluster analysis we used a modified
version of the K-means algorithm. It is shown that
the investigated parameters are capable of extracting
information from the EEG relevant for sleep stage scoring. Using
the modified K-means algorithm it is possible to find
‘similar’ segments and hence automate the detection of sleep
stages. However, extra information, e.g., the ECG
(electrocardiogram) or the EOG (electrooculogram), is probably
necessary for a clear discrimination between the different sleep
stages.

Keywords: automatic sleep scoring, EEG analysis


1.  INTRODUCTION 

An  EEG  (electroencephalogram),  a  measurement  of  the  time- 
varying  potential  differences  between  electrodes  fixed  on  the 
scalp,  is  an  important  clinical  aid  used  by  the  neurologist 
for  the  diagnosis  of  sleep  disorders.  Sleep  is  a  non-uniform 
biological  state  and  can  be  divided  into  2  main  types:  rapid 
eye  movement  (REM)  and  non-rapid  eye  movement  (NREM) 
sleep.  The  latter  is  subdivided  into  stages  1,  2,  3  and  4 
according  to  the  current  sleep  scoring  standard  proposed  by 
Rechtschaffen  and  Kales  [1]. 

A review of the EEG can reveal unusual patterns. However,
a complete visual inspection of a long-term EEG recording
is a time-consuming and difficult task, so a method that
facilitates the review would be valuable. In the past
a number of automated sleep stage scoring methods were
proposed based on EEG records, sometimes in combination
with the EOG (electrooculogram) and the EMG
(electromyogram) [2,3]. These methods have in common that they
extract certain features from segments of the recordings and
apply a number of rules to classify these segments into one of
the 5 sleep stages.

We present a method that uses a number of time and frequency
domain parameters obtained from a segmented sleep
EEG recording to construct a vector space. By using a
slightly modified version of the K-means algorithm it is
possible to find clusters in this vector space. By assigning a
label to each cluster according to the manual scoring of the
respective codebook vectors, we achieve a (semi-)automatic
detection of the sleep stages using the EEG. The method in
essence only searches for ‘similar’ segments and thus no a
priori rules need to be incorporated, leaving the final decision
to the human reviewer.

Figure 1. Schematic overview of the algorithm.

2.  METHOD 


The method we developed consists of 4 consecutive steps,
as depicted in figure 1: segmentation; parameter extraction;
cluster analysis; and classification. Three sets of parameters
were compared: the parameters of Hjorth, the harmonic
parameters and the relative band energy. The cluster analysis
was performed using a modified version of the K-means
clustering algorithm.

2.1.  Segmentation 

In the segmentation step, one channel of the sampled EEG,
x[n], is broken down into sections of fixed length, called
segments. We chose a segment length of 10 s because the
complete scoring of the sleep EEG was available in steps of
10 s.
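
As an illustration of this step, the following minimal Python sketch cuts a sampled channel into consecutive 10 s segments. The function name `segment_signal`, the sampling rate argument `fs` and the choice to drop an incomplete trailing segment are assumptions made for the example; the text does not specify them.

```python
def segment_signal(x, fs, seg_seconds=10.0):
    """Split a sampled signal into consecutive fixed-length segments.

    x           -- sequence of samples of one EEG channel
    fs          -- sampling rate in Hz (assumed to be known)
    seg_seconds -- segment length in seconds (10 s in the text)

    A trailing remainder shorter than one segment is dropped
    (an assumption made for this sketch).
    """
    seg_len = int(round(fs * seg_seconds))
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = len(x) // seg_len
    return [list(x[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]


# Usage: 25 s of a dummy signal sampled at 4 Hz gives two complete segments.
signal = list(range(100))
segments = segment_signal(signal, fs=4)
```

Each returned segment can then be passed on to the parameter extraction step.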
2.2.  Parameter extraction

2.2.1.  Parameters of Hjorth

Based on the variance of the signal x[n] and its first and
second derivative (differences) in a segment, Hjorth derived 3
parameters, sometimes called descriptors, for the quantification
of an EEG. If we write the variance of the i-th derivative
of x[n] as σ_i^2 (with σ_0^2 being the variance of x[n]), then the
parameters of Hjorth are defined as follows [4]:

It is possible to relate these parameters to the moments of the
spectral density function Sxx(f) [5], showing that mobility
is a measure for the center frequency and that complexity is
a measure for the bandwidth of the signal.

It has also been shown that the parameters of Hjorth give a valid
description of an EEG only if the signal has a symmetric
probability density function with only one maximum. It must
also be noted that the accuracy with which the parameter
complexity can be computed is limited. This is because
one must calculate the first and second derivatives and
take the ratio between them, which possibly amplifies
the noise. Therefore, to reduce the influence of
high-frequency noise, one should band-pass filter the EEG.

Nonetheless, these parameters can be valuable in a practical
analysis if the EEG patterns to be analysed have a simple
character, e.g., sleep recordings, and because the parameters
can be easily computed from the time signal [5].

Calculation of the parameters of Hjorth for every segment
results in a 3-dimensional vector space.
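
A minimal Python sketch of how such a 3-dimensional parameter vector might be computed for one segment is given below. It assumes the commonly cited form of Hjorth's descriptors, activity = σ_0^2, mobility = σ_1/σ_0 and complexity = (σ_2/σ_1)/(σ_1/σ_0), where σ_i^2 is the variance of the i-th difference of the segment; this form may differ in detail from the definitions in [4]. Derivatives are approximated by first differences and all function names are hypothetical.

```python
import math


def variance(values):
    """Population variance of a sequence of samples."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def first_difference(values):
    """Discrete approximation of the derivative: x[n+1] - x[n]."""
    return [b - a for a, b in zip(values, values[1:])]


def hjorth_parameters(segment):
    """Return (activity, mobility, complexity) for one segment."""
    d1 = first_difference(segment)
    d2 = first_difference(d1)
    var0, var1, var2 = variance(segment), variance(d1), variance(d2)
    if var0 == 0 or var1 == 0:
        raise ValueError("mobility and complexity are undefined for this segment")
    activity = var0
    mobility = math.sqrt(var1 / var0)
    # Ratio of the mobility of the first difference to the mobility of the signal.
    complexity = math.sqrt(var2 / var1) / mobility
    return activity, mobility, complexity


# Usage: a pure sinusoid has a complexity close to 1.
omega = 2 * math.pi / 20
segment = [math.sin(omega * n) for n in range(1000)]
activity, mobility, complexity = hjorth_parameters(segment)
```

The two successive differences in this sketch are also where high-frequency noise is amplified, which is why filtering the EEG beforehand matters for the complexity.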


2.2.2.  Harmonic parameters

Using an estimate of the spectral density function Sxx(f),
the harmonic parameters [6] are the center frequency, the
bandwidth and the value at the center frequency, defined as
follows:

These parameters are calculated using the spectral density
function between fL and fH, thus allowing a specific band
in the EEG to be investigated instead of the whole EEG
spectrum. The spectral density function Sxx(f) was estimated
using the method of Welch [7].

2.2.3.  Relative band energy

Using 7 predefined frequency bands, the relative band energy
is defined as the ratio of the energy in a band to the total
energy. The 7 bands we used are given in table 1, in
accordance with [8].


2.3.  Cluster analysis

The goal in cluster analysis is to categorize (cluster) a number
of points into K groups or clusters so that the distortion, i.e.,
the within-cluster sum of distances between member points
and the centroid (also called the codebook vector), is
minimized. In general, it is not possible to find an analytical
solution that yields the global minimum. Therefore, one uses
an algorithm that is at least guaranteed to find a local
minimum.

We used a slightly modified version of the K-means algorithm
[9] to find the clusters and corresponding codebook
vectors. In the basic K-means algorithm one starts with K
initial codebook vectors and these are iteratively adjusted
until a (local) minimum is found. The final result is, however,
very sensitive to the selection of the initial codebook
vectors. In our implementation, the modified K-means
algorithm, we start with only 2 initial codebook vectors and
apply the basic K-means algorithm to obtain 2 good codebook
vectors. Then we iteratively increase the number of
clusters until the desired number K is reached by dividing
the greatest cluster (the cluster whose within-cluster sum of
distances between member points and the centroid is
greatest) into 2 new clusters and applying the K-means algorithm
again.

It is also important to mention that we chose the so-called
mini-max centre as the codebook vector of a cluster instead of
the centroid. The mini-max centre is the point in the cluster
whose maximal within-cluster distance is minimal. As a
result we always obtain codebook vectors that correspond to a
segment of the EEG.

We chose K equal to 20, and afterwards reduced the number
of clusters to 5 (the number of sleep stages) by grouping some
clusters so that non-spherical clusters could be modeled.
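
The modified K-means procedure with mini-max centres can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used for the results. The text does not say how the 2 initial codebook vectors are chosen, how the greatest cluster is divided, or whether the mini-max centre is used in every update. The sketch starts from the first point and the point farthest from it, splits a cluster by seeding a new codebook vector at the member farthest from the current centre, and uses the mini-max centre in every update. All function names are hypothetical.

```python
import math


def minimax_centre(members):
    """Member whose largest distance to any other member is smallest."""
    return min(members, key=lambda c: max(math.dist(c, p) for p in members))


def assign(points, codebook):
    """Group points by their nearest codebook vector."""
    clusters = [[] for _ in codebook]
    for p in points:
        nearest = min(range(len(codebook)),
                      key=lambda i: math.dist(p, codebook[i]))
        clusters[nearest].append(p)
    return clusters


def basic_k_means(points, codebook, max_iter=100):
    """Alternate assignment and codebook update until nothing changes."""
    for _ in range(max_iter):
        clusters = assign(points, codebook)
        # Assumption: the mini-max centre replaces the centroid in every update.
        updated = [minimax_centre(c) if c else codebook[i]
                   for i, c in enumerate(clusters)]
        if updated == codebook:
            break
        codebook = updated
    return codebook, assign(points, codebook)


def distortion(members, centre):
    """Within-cluster sum of distances to the codebook vector."""
    return sum(math.dist(p, centre) for p in members)


def modified_k_means(points, k):
    """Grow the codebook from 2 to k vectors by splitting the greatest cluster."""
    # Assumed initialisation: the first point and the point farthest from it.
    first = points[0]
    second = max(points, key=lambda p: math.dist(p, first))
    codebook, clusters = basic_k_means(points, [first, second])
    while len(codebook) < k:
        worst = max(range(len(codebook)),
                    key=lambda i: distortion(clusters[i], codebook[i]))
        centre = codebook[worst]
        members = clusters[worst] or [centre]
        # Assumed split rule: seed a new vector at the farthest member.
        seed = max(members, key=lambda p: math.dist(p, centre))
        codebook = codebook[:worst] + [centre, seed] + codebook[worst + 1:]
        codebook, clusters = basic_k_means(points, codebook)
    return codebook, clusters


# Usage: three well-separated groups of 2-D parameter vectors.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1),
          (0, 10), (1, 10), (0, 11)]
codebook, clusters = modified_k_means(points, k=3)
```

Because every codebook vector returned by `minimax_centre` is itself a member of the data set, it always corresponds to an actual EEG segment that can be shown to the reviewer.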

2.4.  Classification 

The final step is classification. Every point (corresponding to
a segment of the EEG) in a cluster is scored according to the
(manual) scoring of the segment corresponding to the
codebook vector of that cluster. The classification into sleep
stages of the whole EEG thus only requires the manual scoring of
20 segments.
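
The labelling step amounts to a table lookup, as the following Python sketch shows. The names `classify_segments`, `cluster_of_segment` and `stage_of_cluster`, as well as the stage labels in the usage lines, are hypothetical and only serve the example.

```python
def classify_segments(cluster_of_segment, stage_of_cluster):
    """Give every segment the manual score of its cluster's codebook segment.

    cluster_of_segment -- cluster index of each segment, in recording order
    stage_of_cluster   -- manual sleep stage of each codebook segment
    """
    return [stage_of_cluster[k] for k in cluster_of_segment]


# Usage: 4 clusters, two of which were scored as the same stage and are
# therefore grouped together by the lookup.
stage_of_cluster = ["W", "2", "2", "REM"]
cluster_of_segment = [0, 0, 1, 2, 2, 3, 1]
stages = classify_segments(cluster_of_segment, stage_of_cluster)
```

Clusters that receive the same manual score merge automatically, which is how a larger number of clusters can be reduced to the number of sleep stages.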


3.  RESULTS 

To verify the method we applied the algorithm to one 6-hour
sleep EEG recording. Twenty-one electrodes were placed
according to the international 10-20 system, with six additional
lateral electrodes to cover the temporal regions. The sleep
EEG had been visually scored by an experienced neurologist
in steps of 10 s.

Figure 2. The vector space constructed with the harmonic
parameters for the channel F7-T7.

The algorithm as described above was applied to the
channel F7-T7 for the 3 sets of parameters. Figure 2 depicts
the resulting vector space for the harmonic parameters, for
the first 2 hours of the EEG. The results with the 2
other sets of parameters, the parameters of Hjorth and the
relative band energy, are very similar. The regions
(clusters) corresponding to the different sleep stages (as visually
scored) are indicated.
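
The harmonic parameters and relative band energies that span these vector spaces can be computed from a spectral density estimate. The Python sketch below assumes the usual moment-based definitions: the center frequency as the first spectral moment in the band, the bandwidth as the square root of the second central moment, and the spectral value nearest the center frequency. It also assumes a uniformly spaced frequency grid and a spectral estimate (for instance from Welch's method) that is already available as two lists. The function names are hypothetical, and the band limits in the usage lines are arbitrary, not those of table 1.

```python
import math


def harmonic_parameters(freqs, psd, f_lo, f_hi):
    """Return (center frequency, bandwidth, value at center frequency)
    of a spectral density estimate restricted to [f_lo, f_hi]."""
    band = [(f, s) for f, s in zip(freqs, psd) if f_lo <= f <= f_hi]
    total = sum(s for _, s in band)
    if total <= 0:
        raise ValueError("no spectral energy in the requested band")
    f_c = sum(f * s for f, s in band) / total
    f_sigma = math.sqrt(sum((f - f_c) ** 2 * s for f, s in band) / total)
    # Spectral value at the grid frequency nearest to the center frequency.
    s_fc = min(band, key=lambda pair: abs(pair[0] - f_c))[1]
    return f_c, f_sigma, s_fc


def relative_band_energy(freqs, psd, bands):
    """Energy in each band [lo, hi) divided by the total energy."""
    total = sum(psd)
    if total <= 0:
        raise ValueError("spectral estimate has no energy")
    return [sum(s for f, s in zip(freqs, psd) if lo <= f < hi) / total
            for lo, hi in bands]


# Usage: a triangular spectral peak centred on 10 Hz (synthetic data).
freqs = list(range(21))
psd = [max(0, 3 - abs(f - 10)) for f in freqs]
f_c, f_sigma, s_fc = harmonic_parameters(freqs, psd, 0, 20)
energies = relative_band_energy(freqs, psd, [(0, 10), (10, 21)])
```

On a uniform grid the frequency spacing cancels in every ratio, so plain sums can stand in for the integrals over Sxx(f).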

Figure  2  shows  that  stage  w  (awake)  and  stage  1  can  be 
clearly  distinguished.  The  clusters  corresponding  to  stage 
2  and  stage  3  are  somewhat  overlapping,  hence  making  the 
automatic  detection  harder.  However,  it  should  be  noted  that
even  experienced  neurologists  have  difficulty  in  classifying 
the  different  stages  without  using  extra  information  (e.g., 
ECG  and  EOG).  Furthermore,  automatic  detection  of  the 
sleep  stages  is  complicated  by  the  presence  of  so-called  sleep 
spindles,  short  waveforms  (2-3  s)  with  a  frequency  of  12-14 
Hz.  The  parameter  vectors  associated  with  these  spindles 
are  scattered  in  the  constructed  vector  space. 


Figure 3 shows the 20 clusters found in the vector space
after cluster analysis with the modified K-means
algorithm. Note that the different parameters had to be
normalized prior to clustering. Figure 4 depicts the final
classification. As suspected, sleep spindles are not correctly
classified because the K-means algorithm searches for
spherical clusters. We suggest altering the method so that in
a first step the spindles are detected and in a second step the
sleep stages are detected, without taking into account the
segments containing a detected spindle.
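
The text does not state which normalization was applied before clustering. One common choice, shown here purely as an assumption, is to rescale every parameter to zero mean and unit standard deviation so that no single parameter dominates the distances used by the clustering. The function name `zscore_columns` is hypothetical.

```python
import statistics


def zscore_columns(vectors):
    """Rescale each parameter (column) to zero mean and unit standard deviation."""
    columns = list(zip(*vectors))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [tuple((v - m) / s if s > 0 else 0.0
                  for v, m, s in zip(row, means, stds))
            for row in vectors]


# Usage: two parameters on very different scales become comparable.
vectors = [(1.0, 100.0), (2.0, 200.0), (3.0, 300.0)]
normalized = zscore_columns(vectors)
```

After this rescaling both columns of the example carry identical values, although the raw parameters differed by two orders of magnitude.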

4.  CONCLUSION  AND  DISCUSSION 

The parameters we investigated, the parameters of Hjorth,
the harmonic parameters and the relative band energy, are
capable of extracting relevant information from the EEG
usable for sleep stage classification. However, it should be noted
that extra information (e.g., ECG and EOG) is needed for a
clear discrimination between the different sleep stages. The
method will probably perform better if the information
contained in all the channels is used. In addition, the
method has to be validated using the EEG from different
patients.

Figure 3. The constructed vector space after cluster analysis
with the K-means algorithm.

Figure 4. The constructed vector space after classification of
the clusters.

The use of contextual information can probably enhance
the agreement between the classification obtained by the
algorithm and the visual scoring of the expert. Instead of
trying to mimic the classification of the expert, our method
essentially searches for ‘similar’ segments in the EEG. By
presenting the representative segments of the EEG we leave
the final decision to the neurologist.